Goto

Collaborating Authors

 dynamic learning


Dynamic Learning in Large Matching Markets

Neural Information Processing Systems

We study a sequential matching problem faced by large centralized platforms where jobs must be matched to workers subject to uncertainty about worker skill proficiencies. Jobs arrive at discrete times with job-types observable upon arrival. To capture the choice overload phenomenon, we posit an unlimited supply of workers where each worker is characterized by a vector of attributes (aka worker-types) drawn from an underlying population-level distribution. The distribution as well as mean payoffs for possible worker-job type-pairs are unobservables and the platform's goal is to sequentially match incoming jobs to workers in a way that maximizes its cumulative payoffs over the planning horizon. We establish lower bounds on the regret of any matching algorithm in this setting and propose a novel rate-optimal learning algorithm that adapts to aforementioned primitives online. Our learning guarantees highlight a distinctive characteristic of the problem: achievable performance only has a second-order dependence on worker-type distributions; we believe this finding may be of interest more broadly.


Learning Visually Interpretable Oscillator Networks for Soft Continuum Robots from Video

Krauss, Henrik, Licher, Johann, Takeishi, Naoya, Raatz, Annika, Yairi, Takehisa

arXiv.org Artificial Intelligence

Data-driven learning of soft continuum robot (SCR) dynamics from high-dimensional observations offers flexibility but often lacks physical interpretability, while model-based approaches require prior knowledge and can be computationally expensive. We bridge this gap by introducing (1) the Attention Broadcast Decoder (ABCD), a plug-and-play module for autoencoder-based latent dynamics learning that generates pixel-accurate attention maps localizing each latent dimension's contribution while filtering static backgrounds. (2) By coupling these attention maps to 2D oscillator networks, we enable direct on-image visualization of learned dynamics (masses, stiffness, and forces) without prior knowledge. We validate our approach on single- and double-segment SCRs, demonstrating that ABCD-based models significantly improve multi-step prediction accuracy: 5.7x error reduction for Koopman operators and 3.5x for oscillator networks on the two-segment robot. The learned oscillator network autonomously discovers a chain structure of oscillators. Unlike standard methods, ABCD models enable smooth latent space extrapolation beyond training data. This fully data-driven approach yields compact, physically interpretable models suitable for control applications.



Efficient Manifold-Constrained Neural ODE for High-Dimensional Datasets

Guo, Muhao, Li, Haoran, Weng, Yang

arXiv.org Artificial Intelligence

Abstract--Neural ordinary differential equations (NODE) have garnered significant attention for their design of continuous-depth neural networks and the ability to learn data/feature dynamics. However, for high-dimensional systems, estimating dynamics requires extensive calculations and suffers from high truncation errors for the ODE solvers. T o address the issue, one intuitive approach is to consider the non-trivial topological space of the data distribution, i.e., a low-dimensional manifold. Existing methods often rely on knowledge of the manifold for projection or implicit transformation, restricting the ODE solutions on the manifold. Nevertheless, such knowledge is usually unknown in realistic scenarios. Therefore, we propose a novel approach to explore the underlying manifold to restrict the ODE process. Specifically, we employ a structure-preserved encoder to process data and find the underlying graph to approximate the manifold. Moreover, we propose novel methods to combine the NODE learning with the manifold, resulting in significant gains in computational speed and accuracy. Our experimental evaluations encompass multiple datasets, where we compare the accuracy, number of function evaluations (NFEs), and convergence speed of our model against existing baselines. Our results demonstrate superior performance, underscoring the effectiveness of our approach in addressing the challenges of high-dimensional datasets. Understanding and modeling the dynamics of complex systems is a fundamental challenge in various fields, including physics [1], [2], biology [3], engineering [4], natural language processing [5]-[7], and large language models [8].


Action Flow Matching for Continual Robot Learning

Murillo-Gonzalez, Alejandro, Liu, Lantao

arXiv.org Artificial Intelligence

Continual learning in robotics seeks systems that can constantly adapt to changing environments and tasks, mirroring human adaptability. A key challenge is refining dynamics models, essential for planning and control, while addressing issues such as safe adaptation, catastrophic forgetting, outlier management, data efficiency, and balancing exploration with exploitation -- all within task and onboard resource constraints. Towards this goal, we introduce a generative framework leveraging flow matching for online robot dynamics model alignment. Rather than executing actions based on a misaligned model, our approach refines planned actions to better match with those the robot would take if its model was well aligned. We find that by transforming the actions themselves rather than exploring with a misaligned model -- as is traditionally done -- the robot collects informative data more efficiently, thereby accelerating learning. Moreover, we validate that the method can handle an evolving and possibly imperfect model while reducing, if desired, the dependency on replay buffers or legacy model snapshots. We validate our approach using two platforms: an unmanned ground vehicle and a quadrotor. The results highlight the method's adaptability and efficiency, with a record 34.2\% higher task success rate, demonstrating its potential towards enabling continual robot learning. Code: https://github.com/AlejandroMllo/action_flow_matching.


Supplementary Material for Rethinking Value Function Learning for Generalization in Reinforcement Learning A Stiffness Analysis

Neural Information Processing Systems

The green lines in Figure 1 demonstrate that the stiffness decreases as the number of training levels increases in most of the Procgen games. This suggests that the delayed critic update effectively alleviates the memorization problem. Each agent is trained on 200 training levels for 25M environment steps. Each agent is trained for 8M environment steps. The mean is computed over 10 different runs.


Learning Causal Structure Distributions for Robust Planning

Murillo-Gonzalez, Alejandro, Xu, Junhong, Liu, Lantao

arXiv.org Artificial Intelligence

Structural causal models describe how the components of a robotic system interact. They provide both structural and functional information about the relationships that are present in the system. The structural information outlines the variables among which there is interaction. The functional information describes how such interactions work, via equations or learned models. In this paper we find that learning the functional relationships while accounting for the uncertainty about the structural information leads to more robust dynamics models which improves downstream planning, while using significantly lower computational resources. This in contrast with common model-learning methods that ignore the causal structure and fail to leverage the sparsity of interactions in robotic systems. We achieve this by estimating a causal structure distribution that is used to sample causal graphs that inform the latent-space representations in an encoder-multidecoder probabilistic model. We show that our model can be used to learn the dynamics of a robot, which together with a sampling-based planner can be used to perform new tasks in novel environments, provided an objective function for the new requirement is available. We validate our method using manipulators and mobile robots in both simulation and the real-world. Additionally, we validate the learned dynamics' adaptability and increased robustness to corrupted inputs and changes in the environment, which is highly desirable in challenging real-world robotics scenarios. Video: https://youtu.be/X6k5t7OOnNc.


Dynamic Learning in Large Matching Markets

Neural Information Processing Systems

We study a sequential matching problem faced by "large" centralized platforms where "jobs" must be matched to "workers" subject to uncertainty about worker skill proficiencies. Jobs arrive at discrete times with "job-types" observable upon arrival. To capture the "choice overload" phenomenon, we posit an unlimited supply of workers where each worker is characterized by a vector of attributes (aka "worker-types") drawn from an underlying population-level distribution. The distribution as well as mean payoffs for possible worker-job type-pairs are unobservables and the platform's goal is to sequentially match incoming jobs to workers in a way that maximizes its cumulative payoffs over the planning horizon. We establish lower bounds on the "regret" of any matching algorithm in this setting and propose a novel rate-optimal learning algorithm that adapts to aforementioned primitives "online."


Dynamic Learning in Large Matching Markets

Neural Information Processing Systems

We study a sequential matching problem faced by "large" centralized platforms where "jobs" must be matched to "workers" subject to uncertainty about worker skill proficiencies. Jobs arrive at discrete times with "job-types" observable upon arrival. To capture the "choice overload" phenomenon, we posit an unlimited supply of workers where each worker is characterized by a vector of attributes (aka "worker-types") drawn from an underlying population-level distribution. The distribution as well as mean payoffs for possible worker-job type-pairs are unobservables and the platform's goal is to sequentially match incoming jobs to workers in a way that maximizes its cumulative payoffs over the planning horizon. We establish lower bounds on the "regret" of any matching algorithm in this setting and propose a novel rate-optimal learning algorithm that adapts to aforementioned primitives "online."


Jointly Exploring Client Drift and Catastrophic Forgetting in Dynamic Learning

Babendererde, Niklas, Fuchs, Moritz, Gonzalez, Camila, Tolkach, Yuri, Mukhopadhyay, Anirban

arXiv.org Artificial Intelligence

Federated and Continual Learning have emerged as potential paradigms for the robust and privacy-aware use of Deep Learning in dynamic environments. However, Client Drift and Catastrophic Forgetting are fundamental obstacles to guaranteeing consistent performance. Existing work only addresses these problems separately, which neglects the fact that the root cause behind both forms of performance deterioration is connected. We propose a unified analysis framework for building a controlled test environment for Client Drift -- by perturbing a defined ratio of clients -- and Catastrophic Forgetting -- by shifting all clients with a particular strength. Our framework further leverages this new combined analysis by generating a 3D landscape of the combined performance impact from both. We demonstrate that the performance drop through Client Drift, caused by a certain share of shifted clients, is correlated to the drop from Catastrophic Forgetting resulting from a corresponding shift strength. Correlation tests between both problems for Computer Vision (CelebA) and Medical Imaging (PESO) support this new perspective, with an average Pearson rank correlation coefficient of over 0.94. Our framework's novel ability of combined spatio-temporal shift analysis allows us to investigate how both forms of distribution shift behave in mixed scenarios, opening a new pathway for better generalization. We show that a combination of moderate Client Drift and Catastrophic Forgetting can even improve the performance of the resulting model (causing a "Generalization Bump") compared to when only one of the shifts occurs individually. We apply a simple and commonly used method from Continual Learning in the federated setting and observe this phenomenon to be reoccurring, leveraging the ability of our framework to analyze existing and novel methods for Federated and Continual Learning.